1. INTRODUCTION


1.1  Goals 

The  object  of  this  research  is  to  pursue  the  question  of  whether  it 
is possible to develop an intelligent computer system that both understands
and writes programs. The research includes high-level methods of specifying
programs, codification of programming knowledge, and implementation of
working program-writing systems. The domain of programming knowledge
ranges from the fundamentals of programming through list processing to
simple searching, sorting, and inductive-inference programs. Much of this
knowledge  is  more-or-less  pure  programming  knowledge,  along  with  such 
domain-dependent  knowledge  as  is  necessary.  A  major  emphasis  is  the 
codification  of  the  considerable  body  of  list-processing  and  fundamental 
programming  knowledge.  In  the  implementation  aspect  of  our  research,  an 
eventual  target  system  is  expected  to  have  a  deep  understanding  of 
programming  as  demonstrated  by  its  program-writing  ability,  its  line  of 
reasoning in creating a program, and its own discussion of why it made
each  choice  and  what  factors  were  involved. 


One  of  our  earliest  efforts  was  an  exploration  of  more  "human"  methods 
of  program  specification,  such  as  example  input-output  pairs,  program 
traces,  and  generic  examples.  In  the  area  of  codification  of  programming 
knowledge,  we  have  developed  sets  of  rules  for  program  synthesis  that 
cover  low-level  list  and  register  operations,  several  types  of  generate 
and  process  paradigms,  and  simple  searching  and  sorting  programs.  We  have 
implemented  7  different  programs  that  do  all  or  part  of  the  job  of  program 
synthesis. The more recent programs have been moderately successful. In
particular, they can (1) write list-transformation programs, given
example input-output pairs; (2) write low-level list- and register-manipulation
programs; (3) write 3 sorting and permutation programs;
and  (4)  write  a  concept-formation  program. 

1.3  Organization  of  the  Report 

The  reader  should  note  that  material  in  this  progress  report  is 
presented  roughly  chronologically  and  that  some  of  our  false  starts  have 
been  included  for  historical  completeness.  Consequently,  our  later  (and 
hopefully  more  successful)  work  is  presented  towards  the  end  of  the  paper. 
Some  readers  might  wish  to  scan  the  first  parts  and  focus  on  Sections  4.4 
through  4.7. 

Section  2  of  this  report  represents  our  initial  explorations  into 
declarative  methods  for  specifying  procedures.  These  methods  include  both 
individual  and  generic  examples  of  input-output  pairs  to  which  the  program 
being specified must conform; traces of the input(s), output, and perhaps
intermediate  values  throughout  the  execution  of  the  program;  high-level 
programming  operations  and  concepts  expressed  in  English  words  and  phrases; 
and  combinations  of  these.  As  a  result  of  the  conciseness  of  such  program 
descriptions,  they  are  often  incomplete  or  ambiguous.  Some  of  the  methods 
of  Section  2  are  utilized  in  the  running  systems  discussed  in  Section  4. 

Section  3  is  a  brief  discussion  of  what  we  view  as  one  of  the  most 
important  aspects  of  research  in  automatic  programming:  the  codification 
of programming knowledge so that it can be used by a system which understands
and writes programs. Concrete examples of such knowledge are given in
Section  4  for  some  of  the  systems  currently  implemented. 


Section  4  embodies  the  history  of  actual  program-understanding 
systems  which  have  been  implemented  by  our  group  over  the  past  year  and 
one  half.  These  systems  span  a  wide  range  of  input-specification  types, 
built-in programming and task-domain knowledge, and target-program
complexity, but they all have the programming domain of list processing
in common. Although the systems are discussed in chronological order
for the sake of continuity, the reader should note that our most recent
and continuing efforts involve the final 3 systems [see Sections 4.5-4.7].


2.  METHODS  OF  PROGRAM  SPECIFICATION 

One of our goals is to find better ways for people to specify programs.
A  central  question  is  whether  or  not  there  exist  any  methods  or  languages 
that  are  better  than  those  that  currently  exist.  It  is  possible  that, 
say,  ALGOL  is  the  best  language  for  specifying  a  particular  algorithm. 
However,  it  seems  that  for  certain  programs  we  can  find  new  descriptions 
that  are  easier  for  people  to  use.  Certainly,  very  good  special-purpose 
languages  can  be  designed  for  particular  application  areas. 

We  will  present  a  few  methods  for  specifying  list-processing  programs. 
It  is  not  yet  clear  which  methods  are  suited  to  which  classes  of  programs. 
Some  methods  considered  so  far  are  examined  below.  Most  of  these  have 
evolved  from  discussions  within  our  group.  McCune  has  contributed  the 
most  recent  efforts  at  analyzing  and  cataloging  them. 

In  general,  our  target  user  is  a  person  familiar  with  programming  and 
the  subject  domain  of  the  desired  program,  but  not  necessarily  with  the 
details  of  that  program  or  the  language  in  which  it  is  to  be  implemented. 

2.1 Example Input-output Pairs

Grammatical inference and the inference of automata from ordered pairs
representing example input-output behavior have been investigated
[5, ^, 7, 12, 17]. Example input-output pairs can similarly be used to
describe low-level list-transformation algorithms [1, 14].

Consider the program that "flattens" a list. An example of its
behavior is as follows:

    input                 output

    (A (B C (D) E))  ->  (A B C D E)


This example pair is quite simple to write and to most people specifies
the desired effect of the program, but not the detailed operation. Note
that if we add the phrase "remove inner parentheses" to the input-output pair
description, the intent is even clearer. Of course, we still don't
know whether to create a new list or modify the input list unless this,
too, is specified.
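
As a rough illustration, the flatten example can be met by the following
Python sketch. It assumes that lists are modeled as Python lists, atoms as
strings, and that a new list is created rather than the input modified; the
function name is ours and is not taken from any system described here.

```python
def flatten(tree):
    """Return a new list holding the atoms of a nested list, left to right."""
    result = []
    for item in tree:
        if isinstance(item, list):
            result.extend(flatten(item))  # recurse into an inner list
        else:
            result.append(item)           # an atom is copied as-is
    return result


example_input = ["A", ["B", "C", ["D"], "E"]]
print(flatten(example_input))
```

The choice to build a new list is exactly the kind of detail that the
example pair alone leaves open.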

Another list transformation that is easily specified by example is

    input              output

    (A B C D)  ->  ((A B) (A C) (A D) (B C) (B D) (C D))

which  describes  the  generation  of  all  2-element  combinations  from  a  list. 
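
One program consistent with this pair can be sketched in Python using the
standard library. The function name `pairs` is hypothetical, and the
ordering of the result is simply the one that `itertools.combinations`
happens to produce, which agrees with the example.

```python
from itertools import combinations


def pairs(items):
    """Return all 2-element combinations of items, in order of position."""
    return [list(p) for p in combinations(items, 2)]


print(pairs(["A", "B", "C", "D"]))
```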

A  simple  observation  is  that  several  I/O  pairs  may  be  required  to 
specify  a  program  (actually  a  class  of  equivalent  programs)  unambiguously. 

One  disadvantage  of  this  method  is  that  the  program  inferred  by  the  system 
may  not  be  the  intended  program.  Also,  examples  have  to  be  carefully 
chosen.  Hopefully  the  program-writing  program  will  have  some  model  of 
human  preferences  and  will  not  infer,  say,  the  function  having  the  constant 
output (A B C D E) from the flatten example given above. In cases
where  it  is  difficult  to  disambiguate  the  intended  program  using  only  examples, 
other  information  sources  could  be  used.  These  include  programming  context 
and  simple  descriptors  like  "a  recursive  function,  not  merely  table  look-up". 
The program-writing program should verify with the user that its choice
of program is what the user intended. One way to do this is to automatically
generate for the user a new I/O pair that disambiguates among the major
candidates.
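
The idea of generating a disambiguating pair might be sketched as follows.
This is only an illustration under our own assumptions: the candidate
programs are ordinary Python functions, the probe inputs are supplied by
hand, and the helper name `distinguishing_input` is hypothetical.

```python
def flatten(tree):
    """Candidate 1: the atoms of a nested list, left to right."""
    out = []
    for item in tree:
        out.extend(flatten(item) if isinstance(item, list) else [item])
    return out


def constant(_tree):
    """Candidate 2: the constant function that also fits the one example."""
    return ["A", "B", "C", "D", "E"]


def distinguishing_input(candidates, probes):
    """Return the first probe on which the candidates disagree, with outputs."""
    for probe in probes:
        outputs = {name: fn(probe) for name, fn in candidates.items()}
        values = list(outputs.values())
        if any(v != values[0] for v in values[1:]):
            return probe, outputs
    return None


example = ["A", ["B", "C", ["D"], "E"]]
candidates = {"flatten": flatten, "constant": constant}
print(distinguishing_input(candidates, [example, [["X"], "Y"]]))
```

The user would then be shown the probe and asked which output is intended.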


In any program-specification method that requires some inference on
the part of the computer, there will be a chance that the computer will
synthesize the wrong program. This lack of control is especially upsetting
to good programmers. However, high-level specifications are invariably
inexact, which leads to the need for inference to fill in details. More
research on this problem area is required, but any solution would seem to
require a high degree of 2-way dialog between the user and the system.

2.2 Program Traces

Some work has been done on the inference of programs from traces [2].
This method is more complete than example I/O pairs in that it tends to
describe the algorithm used to compute the output, as well as the
input-output relation. Thus the pair

    input            output

    (3 1 4 2)  ->  (1 2 3 4)

specifies  a  sort.  But  the  trace  of  the  input  and  output 


                 input          output

    initially:   (3 1 4 2)      ()

    next:        (1 4 2)        (3)

    next:        (4 2)          (1 3)

    next:        (2)            (1 3 4)

    finally:     ()             (1 2 3 4)

implies an insertion-sort algorithm, with details omitted.
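
To make the connection concrete, here is a small Python sketch of an
insertion sort that records the same kind of trace. The function name and
the representation of a trace as a list of (input, output) snapshots are our
own assumptions, not part of the method in [2].

```python
def insertion_sort_trace(items):
    """Sort by repeated insertion, recording (input, output) after each step."""
    remaining = list(items)
    output = []
    trace = [(list(remaining), list(output))]   # the "initially" row
    while remaining:
        x = remaining.pop(0)                    # take the next input element
        i = 0
        while i < len(output) and output[i] <= x:
            i += 1                              # find its place in the output
        output.insert(i, x)
        trace.append((list(remaining), list(output)))
    return trace


for inp, out in insertion_sort_trace([3, 1, 4, 2]):
    print(inp, out)
```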

We would like to emphasize a new aspect of program inference from
traces, namely, the utilization of several knowledge sources to write the
program. These sources include the subject domain for which the program
is written, a knowledge of what the common operations are, and other
specifications that are given for the program. The user could supply
further information by annotating the traces to provide disambiguation
or further specification, as is discussed in Section 2.9. An example of
additional specification of the sort program above might be the word
"recursive".

2.3  Generic  Examples 

Generic  examples  lie  somewhere  between  example  I/O  pairs  and  formal 
predicate-calculus I/O specifications [13, 32] in explicitness. The
ellipsis  notation  is  used  to  specify  an  indefinite  number  of  elements. 

For  example,  the  specification 

    input                    output

    (x1 x2 x3 ... xn)  ->  (xn xn-1 xn-2 ... x1)

gives  the  reverse  function.  The  alternate  function  may  be  specified  by 

    input                      output

    (x1 x2 x3 x4 x5 ...)  ->  (x1 x3 x5 ...)

This  notation  is,  of  course,  ambiguous,  and  a  verification  phase  would 
have  to  confirm  the  hypothesised  program. 
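
The ambiguity can be illustrated with a short Python sketch. It assumes that
a generic example is checked by instantiating it at several lengths; the two
candidate readings of "alternate" and all function names are hypothetical.

```python
def instantiate(n):
    """Build the concrete list (x1 x2 ... xn) for a given length n."""
    return [f"x{i}" for i in range(1, n + 1)]


def alternate(xs):
    """Reading 1: every other element, for a list of any length."""
    return xs[::2]


def first_three_odd(xs):
    """Reading 2: only the first, third, and fifth elements."""
    return xs[0:5:2]


for n in (5, 7):
    xs = instantiate(n)
    print(n, alternate(xs), first_three_odd(xs))
```

The two readings agree at length 5 and differ at length 7, so a verification
phase that tries a longer instance would expose the difference.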

2.4  Generic  Traces 

Similarly, the ellipsis notation can be used in a trace. As an
example, here is a generic trace which specifies the combinations of
elements of a set taken 2 at a time:

    car(input)  cdr(input)      output

    x1          (x2 x3 x4 ...)  ((x1 x2) (x1 x3) (x1 x4) ...)

    x2          (x3 x4 ...)     ((x1 x2) (x1 x3) (x1 x4) ... (x2 x3) (x2 x4) ...)


2.5  Graphical  Descriptions 

Pictures  of  input  and  output  are  obviously  well  suited  for  depicting 
simple  list  transformations  in  which  the  structures  are  difficult  to  describe 
in linear strings, yet easy to describe in 2 dimensions [19]. We have not
investigated  any  of  these  methods. 


2.6  Conceptual  Descriptions 

High-level  program  description  is,  of  course,  the  most  convenient 
specification  technique  if  the  right  high-level  primitives  are  available. 

In  the  extreme  case  we  would  just  give  the  name  (or  number)  of  the  desired 
program. More interesting cases for automatic-programming studies are
those in which there is some distance from the primitives to the program
description. 

For  the  domain  being  considered  (list  transformations),  nice  conceptual 
descriptors  (primitives)  include  "element  conserving",  "order  preserving", 
"represents  a  set",  "represents  a  tree",  "represents  a  graph",  "permutation", 
"table  look-up",  etc.  These  can  be  embedded  in  either  inherently  ambiguous 
or  unambiguous  languages  (ranging  from  versions  of  English  to  unambiguous 
high-level,  but  conventional,  programming  languages)  and  can  either  partially 
or  completely  specify  the  program.  We  would  like  to  emphasize  partial 
descriptions, ambiguous languages, and primitives that are not quite
high-level  enough  to  make  the  task  too  easy  for  the  system.  By  combining 
several  ambiguous  partial  descriptions  with  knowledge  of  the  programming 
domain,  a  system  may  be  able  to  decipher  descriptions  that  humans  can 
easily produce. (Conventional programming languages that are completely
descriptive and unambiguous, but lacking primitives of a high enough level,
are still of interest.)


2.7  Natural-language  Descriptions 

As  used  by  documentors  and  describers  of  algorithms  [19],  natural 
(English)  language  mixed  with  mathematical  and  programming  jargon  can  be 
an  effective  method  for  communicating  an  algorithm.  Good  English-like 
program  descriptions  can  be  easily  understood  by  humans,  although  again 
it's  not  clear  under  what  circumstances  they  are  the  easiest  descriptions 
to  generate.  English  descriptions  can,  of  course,  describe  input-output 
relations  or  algorithms,  be  partial  or  complete,  high-level  or  low-level, 
interactive  or  not,  etc.  Here  is  an  example  of  a  partial  algorithm 
specification  [20]: 


An  exchange  sort.  If  two  items  are  found  to  be  out  of 
order, they are interchanged. This process is repeated
until  no  more  exchanges  are  necessary. 


We  intend  to  examine  the  issues  of  when  English  is  a  useful  adjunct 
in  program  description  and  how  a  programming  system  might  deal  with  it. 
Elschlager is studying natural-language descriptions of programs in order
to develop an appropriate internal representation for them. From this has
come a representation which is primarily relational, but also has
qualification and quantification primitives. Possible inputs into this
system  might  be  either  a  limited  subset  of  English  or  a  more  rigidly 
structured  "parenthesized"  English.  Future  work  includes  relating  this 
internal  representation  of  a  program  to  the  programming  concepts  and 
data  structures  it  will  use. 

2.8 User-machine Dialog

A  simple  and  useful  vehicle  for  embedding  the  various  methods  of 
program  specification  is  conversational  interaction  or  dialog  between 
the user and the automatic-programming system. Such a style was used by
Floyd [10] in his suggestion for a system aimed at interactive program
verification and synthesis. With such a system there is a need for a
large  amount  of  programming  knowledge  of  both  a  domain-specific  and 
general  nature.  The  system  would  draw  on  this  knowledge  in  order  to 
understand  the  user's  remarks  and  pose  meaningful  questions. 

One  could  imagine  a  conversation  beginning  with  the  user's  specification 
of  the  general  program  desired  and  proceeding  in  a  basically  top-down 
manner.  As  the  system  needed  more  precise  information,  it  would  ask  for 
it.  Presumably  the  user  could  indicate  a  preference  for  a  particular 
programming  decision  (e.g.,  choice  of  data  representation)  or  possibly  a 
lack  of  preference,  in  which  case  the  system  would  make  its  own  reasonable 
choice.  A  variety  of  methods  of  specification  (e.g.,  examples,  traces, 
formal  specifications,  etc.)  might  play  a  role  in  such  a  conversation. 

This  yields  a  paradigm  similar  to  that  of  structured  programming  [6],  but 
one  in  which  the  computer  fills  in  the  details . 

A typical low-level dialog might proceed as follows (for readability
the dialog is mostly English):


USER:  Please  write  me  a  program  which  extracts 
information  from  pairs  in  an  input  list. 

MACHINE:  What  will  the  input  be? 

USER:  The  input  will  consist  of  2  parts:  an 
atom  and  a  list  of  pairs. 

MACHINE: What should the output be?

USER:  A  list  of  the  other  half  of  the  pairs  which 
contain  the  input  atom. 

MACHINE:  What  do  you  mean  by  "the  other  half  of 
the  pairs"? 

USER:  If  the  first  element  of  one  of  the  pairs 
is  the  input  atom,  I  want  the  second 
element;  if  the  second  element  is  the 
input  atom,  I  want  the  first  element. 

MACHINE:  What  shall  be  done  if  one  of  the  pairs 
is  (x  x)  ,  where  x  is  the  input  atom? 

USER:  Then  the  output  list  should  include  x  . 

MACHINE:  OK,  I  think  that’s  enough. 

[pause] 

MACHINE: Wait a minute. Does the order of the
elements  in  the  output  list  matter? 

USER:  No,  not  at  all. 

MACHINE:  Then  I  will  simply  use  the  order  in  which 
the  pairs  occur  in  the  input  list,  since 
that  is  the  easiest.  In  this  case  I  could 
overwrite  the  input  list.  Should  I  do  that, 
or  would  you  rather  that  the  list  be  preserved? 

USER:  Preserve  it. 

[pause] 

MACHINE:  Do  you  want  this  written  as  a  function 
of  the  2  input  arguments? 

USER:  Sure,  call  the  function  f  . 



MACHINE:  How’s  this: 

f(x, y) <- if null(y) then nil else
    if x = caar(y) then cons(cadar(y), f(x, cdr(y))) else
    if x = cadar(y) then cons(caar(y), f(x, cdr(y))) else
    f(x, cdr(y));

USER:  Looks  good  to  me. 


The  above  dialog  is  at  such  a  low  level  that  it  is  completely  dependent 
upon  the  fact  that  LISP  lists  are  being  dealt  with,  thus  requiring  the  user 
to  have  some  familiarity  with  LISP.  Higher-level  dialogs  of  domain-specific 
programs  can  be  less  representation  dependent  and  can  be  carried  on  in  the 
vocabulary  of  the  particular  problem  domain. 
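
For readers less familiar with LISP, the function produced in the dialog can
be rendered roughly as the Python sketch below. It assumes that each pair is
a 2-element tuple and that the result is built as a new list, which leaves
the input list untouched as the user requested.

```python
def f(x, y):
    """Return the partners of atom x among the pairs in list y, in order."""
    if not y:                       # null(y): no pairs left
        return []
    first, second = y[0]            # caar(y) and cadar(y)
    if x == first:
        return [second] + f(x, y[1:])
    if x == second:
        return [first] + f(x, y[1:])
    return f(x, y[1:])              # this pair does not mention x


print(f("B", [("A", "B"), ("B", "C"), ("D", "E")]))
print(f("x", [("x", "x")]))
```

Note that the pair (x x) is handled by the first test, so x appears once in
the output, as agreed in the dialog.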


2.9 Information Necessary to Complete the Specification of a Program

In completing the specification of a program, we can imagine a
"checklist"  that  a  program-writing  system  might  have  for  each  type  of 
program  it  can  handle.  It  might  work  on  completing  its  checklist  by 
inference  from  partial  specifications,  interactions  with  the  user,  context, 
and  default  conditions.  Such  a  checklist  might  include  terminating 
conditions,  auxiliary  functions,  restrictions  on  input  (e.g.,  whether  a 
list has constant or variable length), what data representations are
available,  etc.  Certainly  a  program-understanding  system  needs  to  ask 
many  questions  about  the  target  program.  (But  not,  "What's  the  first 
instruction?  Now,  what's  the  second?  ...") 
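
A checklist of this kind might be represented along the following lines.
This is a sketch under our own assumptions: the four fields are taken from
the items listed above, while the class, the method, and the idea of
consulting information sources in a fixed priority order are hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Checklist:
    terminating_conditions: Optional[str] = None
    auxiliary_functions: Optional[str] = None
    input_restrictions: Optional[str] = None
    data_representations: Optional[str] = None

    def open_items(self):
        """Names of the checklist entries that are still unanswered."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


def complete(checklist, sources):
    """Fill open entries from each source in turn (earlier sources win)."""
    for source in sources:
        for name in checklist.open_items():
            if name in source:
                setattr(checklist, name, source[name])
    return checklist


user_answers = {"input_restrictions": "variable length"}
defaults = {"input_restrictions": "constant length",
            "data_representations": "linked list"}
c = complete(Checklist(terminating_conditions="input list is empty"),
             [user_answers, defaults])
print(c.open_items())
```

Whatever remains in `open_items()` is what the system would still have to
ask the user about.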


2.10  A  Comparative  Example 

Let's consider the specification of a simple program as a vehicle for
discussion  of  the  merits  of  various  methods  of  description.  Consider  the 
following  example  of  the  association  search  synthesized  in  Section  2.8: 


    input 1   input 2                      output

    B         ((A B) (B C) (B E) ...)  ->  (A C ...)

Note that we've incorporated the ellipsis notation of generic examples
into an example input-output pair. Subjectively, this specification seems
not  as  thorough  as  we  might  wish.  Can  input  1  be  non-atomic?  What  if 
(B  B)  occurs  in  input  2?  What  if  an  element  of  input  2  is  atomic?  Etc. 

As the complexity of the transformation increases, example input-output
pairs begin to require more inference to determine the intended transformation.
One way out is to clarify the intended function by describing more elementary
relations between input and output elements, namely, "The letters A and C
are  in  the  output  because  they  occur  in  the  second  input  paired  with  B 
(the  first  input)".  If  we  allow  a  higher-level  concept,  it  is  even  easier 
to  describe:  "a  commutative  LISP  assoc  operation".  This  phrase 
describes  the  function  fairly  clearly  (to  a  LISP  programmer) .  The  added 
description,  "order  preserving",  explains  why  C  follows  A  in  the  output, 
but a reasonable program should assume (and test) order preservation in the
absence  of  other  information.  Obviously  the  conceptual  descriptions  alone, 
without  the  example,  do  not  clearly  determine  the  intended  program. 

Together  they  do  a  reasonable  job. 

As another more explicit technique, McCune and Lenat have suggested
describing  the  lower-level  relations  for  the  above  example  graphically 
as,  say, 


This  scheme  clarifies  why  each  element  of  the  output  is  where  it  is 
and  from  where  in  the  input  it  came. 

Of  course,  a  partial  or  even  complete,  but  precise  description  can 
be  given  in  predicate  calculus  [13,32].  Here  is  one  possibility: 


(∀ v, w, x, y, z) [input(x, y) ∧ output(z) ∧ atom(x)
    ∧ list(y) ∧ list(z) ∧ sublist(w, y) ∧ length(w, 2)
    ∧ member(x, w) ∧ member(v, w) ∧ (x ≠ v ∨ ∀u [member(u, w) ⊃ u = v])]
    ⊃ member(v, z)

(∀ t, u, v, w, x, y, z) [input(x, y) ∧ output(z) ∧ list(y)
    ∧ list(z) ∧ member(v, z) ∧ member(w, z) ∧ sublist(t, y)
    ∧ sublist(u, y) ∧ member(v, t) ∧ member(w, u)
    ∧ before(t, u, y)] ⊃ before(v, w, z)


(where before(t, u, y) means element t occurs before element u in list y).
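
As a sketch of how such a specification could be used for checking, the two
formulas above can be read as executable tests on a proposed output. The
Python below follows our own reading of them: pairs are 2-element tuples,
"before" means that some occurrence of one element precedes some occurrence
of the other, and the function names are hypothetical. It mirrors the
formulas literally and so inherits any cases they do not handle well.

```python
def before(a, b, seq):
    """True if some occurrence of a precedes some occurrence of b in seq."""
    return any(seq[i] == a and seq[j] == b
               for i in range(len(seq)) for j in range(i + 1, len(seq)))


def includes_partners(x, y, z):
    """First formula: every partner of x in a pair of y must appear in z."""
    for w in y:
        if len(w) == 2 and x in w:
            for v in w:
                if (x != v or all(u == v for u in w)) and v not in z:
                    return False
    return True


def preserves_order(y, z):
    """Second formula: output order follows the order of the source pairs."""
    for t in y:
        for u in y:
            if before(t, u, y):
                for v in t:
                    for w in u:
                        if v in z and w in z and not before(v, w, z):
                            return False
    return True


y = [("A", "B"), ("B", "C"), ("D", "E")]
print(includes_partners("B", y, ["A", "C"]), preserves_order(y, ["A", "C"]))
```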

3. CODIFICATION OF PROGRAMMING KNOWLEDGE

The easy part of codifying programming knowledge is the now more-or-less
conventional formal specification of the semantics of each operation in
one's programming language [9, 15, 23]. The more interesting aspect is
the concrete specification of high-level programming constructs (e.g., a
loop with an exit), and those programming methods that are used in the
process of designing a program, but never appear explicitly in the program.

An  example  is  the  detailed  specification  of  sufficient  methods  for  performing 
a  generate-and-test  operation  on  an  implicit  representation  of  a  set. 

Newell [24] has presented a fairly high-level (non-programmable) description
of 5 common artificial-intelligence problem-solving methods, including
generate and test, heuristic search, hill climbing, match, and induction.
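
To show what a generate-and-test operation on an implicit representation of
a set can look like at the programmable level, here is a minimal Python
sketch. The set of all 2-element combinations is never built in full; a
generator produces candidates one at a time. The function names and the
particular test are our own illustrative choices.

```python
from itertools import combinations


def generate_and_test(candidates, test):
    """Return the first generated candidate that passes the test, else None."""
    for candidate in candidates:
        if test(candidate):
            return candidate
    return None


def pair_summing_to(numbers, target):
    """Search the implicit set of pairs for one whose sum equals target."""
    return generate_and_test(combinations(numbers, 2),
                             lambda pair: sum(pair) == target)


print(pair_summing_to([3, 1, 4, 2], 6))
```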

Much  of  the  work  in  structured  programming  [6]  has  been  aimed  at 
explicating  such  programming  methodology,  but  has  generally  been  at  too 
high  a  level  for  implementation,  being  aimed  at  human  programmers.  We 
have  begun  to  codify  and  embed  this  type  of  knowledge  in  2  of  our  systems 
[see Sections 4.6 and 4.7].

How big a body of knowledge are we interested in, and how much detail
is needed? Our crude preliminary estimate is that something like a few
thousand "facts" (any convenient chunks of knowledge, such as production
rules,  axioms,  or  goal  statements)  could  enable  a  program  to  understand 
simple  list-processing  programs.  We  have  generated  a  proposed  set  of 
facts  necessary  for  a  program-understanding  system  to  understand  very 
simple insertion- and selection-sort programs. 100 to 200 facts seem
adequate,  without  counting  either  the  semantics  of  LISP  or  any  efficiency 
or optimization knowledge. Including these other knowledge sources would
bring us to several hundred. Manna and Waldinger's experience [22] with
the domain of pattern matching indicates that about 75 facts are sufficient
to enable the construction of a unification algorithm (leaving out
efficiency, programming-language semantics, and high-level program-construction
concepts).

Such estimates, crude as they are, give us an idea of how smart a
program-understanding  system  might  become  in  the  next  few  years;  that  is, 
we  can  expect  a  system  to  deeply  understand  a  very  small  set  of  programs. 

Our  plans  are  to  finish  the  characterization  of  simple  sorting  and 
then  to  consider  simple  tree  searching,  table  look-up,  and  set  operations. 
At  the  same  time  we  will  increase  our  emphasis  on  the  automatic  selection 
of representations. These areas all involve more-or-less "general"
programming knowledge and are not too domain specific. Our first more
domain-specific area under attack is that of concept-formation programs
[18], a class of inductive-inference programs that encompasses enough
general programming knowledge to be interesting for that reason. We are
currently defining a set of increasingly complex concept-formation
programs to pace our efforts. FUP5 [see Section 4.6] indicates that there
are about 75 units of knowledge necessary to write a concept-formation
program, where each unit contains about a dozen facts.

It would be nice to know the size of the body that constitutes the
"core" of programming knowledge. As yet, we can only guess. Finding the
knowledge  is  still  a  more-or-less  linear  process;  that  is,  to  add  a  new 
capability  to  an  understanding  system  requires  about  as  much  time  and 
effort  as  it  took  to  add  the  previous  capability.  We  are  beginning  to 
find some commonality in the utilization of previously codified knowledge,
but it's too early yet to make any claims of great insight. However,
we do have a fair degree of faith that there is a subject-independent
core  that  we  will  slowly  extract  and  refine. 

4. IMPLEMENTATION OF PROGRAM-UNDERSTANDING SYSTEMS


For the sake of historical completeness, we will discuss 3 early
implementations that are of limited significance before discussing our
later, more successful systems. Perhaps the main conclusion to be drawn
from  these  is  that  small  efforts  seem  inadequate  for  serious  progress  in 
program-understanding  systems.  Good  programming  systems  will  be  very 
large  and  complex  and  will  take  many  man-years  of  work. 

4.1 Schema Instantiation to Fit Example Input-output Pairs

The first running system in our group was Lenat's IW1, which was
implemented in MLISP [30]. It takes as input several example input-output
list pairs and produces as output LISP programs. The idea is simple:
most elementary programs in the class of interest have 1 or 2 termination
conditions  followed  by  a  recursive  call.  The  structure  of  such  a  program 
can  be  given  by  a  few  high-level  schemata. 

The  system  infers  the  number  and  type  of  arguments  by  examining  the 
example  input-output  pairs.  From  the  number  of  arguments  either  the 
1-input  schema  or  the  2-input  schema  is  selected.  The  1-input  schema  is 

f(x) <-
    if f1(x) = c1 then f2(x) else           [line 1]
    if f3(x) = c2 then f4(x) else           [line 2]
    f5(f6(f7(x)), f8(f9(x)));               [line 3]

where f1 through f9 are functions and c1 and c2 are constants,
all to be determined later. Lines 1 and 2 correspond to termination
conditions, and line 3 corresponds to a recursive call.

The user is asked if the function is recursive. (If it is not,
line 3 is not used.) The default condition is to assume a recursive
function, but no attempt is made to guess that the function is recursive.
The automatic program writer next determines, again by asking, whether
there are 1 or 2 terminating conditions (i.e., line 1 only or both lines
1 and 2) and whether the user wants to suggest either the test or the
value for lines 1 or 2.

Whatever  pieces  are  not  supplied  by  the  user  are  filled  in  by  a 
constrained search process that also fills in the functions in line 3.

The  search  proceeds  as  follows.  First,  an  ordered  set  of  candidates  is 
formed  for  each  subfunction  and  constant.  The  user  can  give  advice  in  the 
form  of  suggested  subfunctions  that  are  likely  to  occur.  A  second 
information  source  is  the  type  (atom,  list,  or  number)  of  each  argument. 
These  factors  are  combined,  using  a  rating  table  containing  the  probability 
of each known function appearing in a particular schema position, to yield
a final ordering. Then the candidate instances of the schema are generated
one by one, in accordance with the orderings of the subfunctions.

Several  tricks  prune  the  search  space.  A  function  is  not  applied  to 
the wrong number or type of arguments. To check this the instantiated
schema is run on the examples, and checking occurs at every step of
execution.  Infinite  recursions  are  detected  and  prevented.  "Infinity" 
is  a  parameter  set  in  advance,  usually  to  a  number  between  17  and  100. 

The function being defined may only occur in line 3, the recursion step,
and  its  arguments  in  the  recursive  call  cannot  be  the  same  arguments  it 
receives  in  the  original  call.  Some  check  should  be  made  that  the 
arguments  are  somehow  moving  toward  the  termination  form,  but  actually  any 
perceived change is allowed. Several special subfunctions, such as the
identity  function  and  a  projection  (or  selection)  function,  are  provided 
to  enable  the  desired  program  to  be  forced  into  one  of  the  2  given 
Procrustean  beds. 
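
The flavor of this search, including the pruning tricks, might be sketched
as follows. This is a deliberately reduced illustration and not the system's
actual procedure: the schema has only four slots, the candidate lists are
assumed to arrive already ordered, and all names (including `LIMIT` for the
"infinity" parameter) are hypothetical.

```python
from itertools import product

LIMIT = 50  # the "infinity" parameter: deepest recursion we will tolerate


class Rejected(Exception):
    """Raised when a candidate instance violates a pruning rule."""


def instantiate(base, base_value, combine, step):
    def f(x, depth=0):
        if depth > LIMIT:
            raise Rejected("recursion too deep")
        if x == base:                      # termination condition
            return base_value
        nxt = step(x)
        if nxt == x:                       # argument must change
            raise Rejected("argument unchanged in recursive call")
        return combine(x, f(nxt, depth + 1))
    return f


def search(examples, bases, base_values, combines, steps):
    """Try candidate instances in order; return the first fitting all examples."""
    for base, bv, (cname, combine), (sname, step) in product(
            bases, base_values, combines, steps):
        f = instantiate(base, bv, combine, step)
        try:
            if all(f(i) == o for i, o in examples):
                return base, bv, cname, sname
        except (Rejected, TypeError, IndexError):
            continue                       # wrong types or runaway recursion
    return None


found = search(
    examples=[(1, 1), (4, 24)],
    bases=[0, 1],
    base_values=[0, 1],
    combines=[("plus", lambda a, b: a + b), ("times", lambda a, b: a * b)],
    steps=[("identity", lambda x: x), ("sub1", lambda x: x - 1)],
)
print(found)
```

Even in this tiny space most candidates are discarded either by the pruning
rules or by failing an example, which hints at why larger searches ran out
of time.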

The  program  is  known  to  have  generated  at  least  8  correct  programs, 
but ran out of time on most other attempts. Among the programs IW1
wrote  are 


    function name    function operation

    sub2             subtract 2 from the (numeric) argument
                     [from 2 examples: 2 → 0 and 7 → 5]

    last             produce a 1-element list containing only
                     the last element of the input list
                     [from 2 examples: (A B) → (B) and
                     (A B C D E) → (E)]

    reverse          reverse a list [from 1 example:
                     (A B C D E) → (E D C B A)]

    Fibonacci        the obvious [from 3 examples: 1 → 1,
                     6 → 8, and 7 → 13]

    factorial        the obvious [from 2 examples: 1 → 1
                     and 4 → 24]

    insert           insert a number into its proper place in
                     an ordered list of numbers
                     [from 3 examples: 2, (1 3 8) → (1 2 3 8);
                     2, (8) → (2 8); and
                     7, (1 5) → (1 5 7)]

    sort             sort a list of numbers, given insert as
                     a primitive function
                     [from 4 examples: (2 3) → (2 3),
                     (3 2) → (2 3),
                     (1 7 6 4) → (1 4 6 7), and
                     (8 1 2 5 3 9) → (1 2 3 5 8 9)]

    flatten          change a tree into a single-level list of
                     the atoms in the tree [from 1 example:
                     (A (B C (D E)) F) → (A B C D E F)]
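
As an illustration only, the insert and sort entries can be rendered in a
modern notation. The following Python sketch shows what the two induced
functions compute. It is not the code that IW1 produced, and the names
insert and sort_by_insert are ours.

```python
def insert(n, ordered):
    """Insert n into its proper place in an ascending list (recursively)."""
    if not ordered or n <= ordered[0]:
        return [n] + ordered
    return [ordered[0]] + insert(n, ordered[1:])


def sort_by_insert(numbers):
    """Sort a list recursively, using insert as a primitive function."""
    if not numbers:
        return []
    return insert(numbers[0], sort_by_insert(numbers[1:]))


# The example pairs from which the functions were induced serve as checks.
assert insert(2, [1, 3, 8]) == [1, 2, 3, 8]
assert sort_by_insert([8, 1, 2, 5, 3, 9]) == [1, 2, 3, 5, 8, 9]
```

Both functions recur on the tail of a list and build the answer on the way
back, which is the kind of shape a fixed recursion schema can express.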


This  approach  appeared  to  have  limited  potential,  so  no  controlled 
experiments  were  run.  The  main  disadvantage  was  that  the  program  had  a 
limited  model  of  its  task  and  little  programming  knowledge,  so  it 
consequently  engaged  in  large  searches. 



4.2  Sequence-extrapolator Writer


This was an INTERLISP [31] program by Lenat. The question was
whether it is possible to write a highly specialized program writer that
produces programs for a given sub-area of inductive inference, in this
case sequence extrapolation [25, 29]. Other specialized program-writing
programs, like compilers and compiler-compilers, have been around for
a while. This new task turned out to be easy.

The  program  begins  with  a  schema  for  a  generalized  sequence- 
extrapolation  program  consisting  of  5  subparts.  The  user  describes,  via 
a  dialog  directed  by  a  decision  tree,  which  capabilities  are  to  be 
included  for  each  subpart.  (Not  all  choices  are  independent,  however.) 

The  system  then  includes  the  appropriate  pieces  of  program  or  data  that 
meet  this  description.  For  example,  for  the  subpart  of  known  sequences, 
the  user  indicates  which  sequences  should  be  immediately  recognizable  by 
exact  match. 

Not  much  was  learned,  except  that  it  is  possible  to  write  a  highly 
specialized  program  writer  for  this  domain.  We  can  guess  that  it  would  be 
easy  to  turn  out  specialist  program  writers  for  other  simple,  well-structured 
domains.  The  system  had  little  of  the  character  of  what  we  call  an 
understanding  system. 

4.3  Ellipsis Translator

This was a small study and INTERLISP program by Shaw designed to
translate a class of ambiguous generic examples into a list of candidate
unambiguous internal representations. For example, the program translates
(x_2 + x_4 + ... + x_n) into the 2 unambiguous interpretations

    \sum_{1 \le i \le n/2} x_{2i}

and

    \sum_{1 \le i \le \log_2 n} x_{2^i}


(although  the  2  interpretations  are  not  represented  internally  in  a  form 
isomorphic  to  the  above) .  The  experimental  program  was  not  pushed,  so  it 
never  left  the  nearly  debugged  stage.  However,  there  are  a  few  comments 
and  observations  we  can  make. 

The  notation  seems  to  be  useful,  and  the  intent  of  the  user  is  often 
easy  to  guess  by  straightforward  techniques.  First,  observe  that  finding 
an  interpretation  reduces  to  sequence  extrapolation  on  the  indices  of  the 
variables. Sequence-extrapolation techniques [25, 29], including successive
differences,  successive  quotients,  and  tests  for  common  sequences,  have 
allowed  the  construction  of  relatively  powerful  sequence  extrapolators 
that  behave  well  and  usually  produce  the  desired  interpretation,  although 
a  non-cooperative  user  can  often  evoke  a  false  interpretation.  A  more 
serious  problem  is  that  of  communicating  to  a  cooperative  user  the 
algorithm  used  to  interpret  the  ellipsis  notation  and  either  verifying 
that  the  first  candidate  is  the  intended  interpretation  or  else  finding 
it  by  some  interactive  procedure. 
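
The reduction to sequence extrapolation on the indices can be sketched
as follows. This Python fragment is our own illustration, not the INTERLISP
program: it applies only the successive-difference and successive-quotient
tests, and the names candidate_rules and interpretations are hypothetical.
It assumes the indices are positive integers.

```python
def candidate_rules(prefix):
    """Return {name: rule} for index rules that fit the leading indices.

    rule(i) gives the i-th index, counting from 1.
    """
    rules = {}
    if len(prefix) < 2:
        return rules
    first = prefix[0]
    # Successive differences: a constant difference means arithmetic.
    diffs = [b - a for a, b in zip(prefix, prefix[1:])]
    if all(d == diffs[0] for d in diffs):
        rules["arithmetic"] = lambda i, a=first, d=diffs[0]: a + (i - 1) * d
    # Successive quotients: a constant integer quotient means geometric.
    if all(a != 0 and b % a == 0 for a, b in zip(prefix, prefix[1:])):
        quots = [b // a for a, b in zip(prefix, prefix[1:])]
        if all(q == quots[0] for q in quots):
            rules["geometric"] = lambda i, a=first, q=quots[0]: a * q ** (i - 1)
    return rules


def interpretations(prefix, last, limit=64):
    """List each full index sequence that starts with prefix and ends at last."""
    found = {}
    for name, rule in candidate_rules(prefix).items():
        seq = []
        for i in range(1, limit + 1):
            seq.append(rule(i))
            if seq[-1] >= last:
                break
        if len(seq) >= len(prefix) and seq[-1] == last:
            found[name] = seq
    return found
```

With the leading indices 2, 4 and a final index of 16, both the arithmetic
and the geometric reading survive, which is the ambiguity described above.
With a final index of 6, only the arithmetic reading remains.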

The internal representation of the meaning does not appear to be a
problem, and good ones should fall out naturally when an
ellipsis-translating mechanism is incorporated into a larger
program-understanding system.

An ideal system should, of course, be forgiving. For example, it
should produce the same interpretation for the following 4 styles:

    (x_1 + x_2 + x_3 + ... + x_n)

    (x_1 + x_2 + x_3 + ... x_n)

    (x_1 + x_2 + x_3 ... + x_n)

    (x_1 + x_2 + x_3 ... x_n)

If the user provides a meaningfully subscripted last element, that
information should be used. For example, in (x_2 x_4 ... x_{2^n}) the last
element should resolve the ambiguity in the sequence beginning 2, 4, ... .
Our ideal system should also handle interleaved sequences (say, from
different sources), such as (x_1 y_1 x_2 y_2 ...); specified intermediate
elements, such as (x_1 x_2 ... x_{2i+1} ...); deleted elements, perhaps
represented as (x_1 x_2 ... ~x_i ... x_n) or other ways; and various
operators, such as + , - , etc.

Waldinger has suggested that a more powerful induction mechanism be
used to allow "formula extrapolation", e.g., to handle examples such as
(A, B, AA, AB, BA, BB, ...). Such a mechanism could be of use in
specifying more complex, but frequently used, enumeration algorithms.

Fusaoka [11] has implemented an embryonic formula extrapolator.


4.4  Our  Simplest  Program-understanding  Program 

The next program showed some rudimentary program-understanding behavior.
It dealt with simple list manipulation, assignment operations, and
arithmetic. The 2 versions of the program were Lenat's PUP1 and a
revised version, PUP2, by Steinberg. Both versions of PUP were written
in QLISP [26] (the successor to QA4 [27]) and INTERLISP.

The specification of the program to be written is basically a
formal input-output relation. The program is structured around QLISP
goal statements, which specify both the desired state and an "apply"
list of subprograms that may be able to achieve that state. A subprogram
may achieve the goal state directly or may decompose the goal into
subgoals and use goal statements to achieve these. We'll describe
several of the tasks PUP accomplished, along with a description of the
stored facts used in each case.

4.4.1  Interchange of Elements.  This is a simple problem,
similar to one solved by Simon's Heuristic Compiler [28]. The problem
statement is

    initial state          final state

    contents(x) = a        contents(x) = b
    contents(y) = b        contents(y) = a

The initial state is assumed and the final state taken as the goal.

One of the programs on the apply list decomposes goals of the form α ∧ β
into the separate conjuncts and uses goal statements to attain first one,
then the other, in a more-or-less depth-first manner.

The program that handles the subgoal contents(x) = b sees that
contents(y) = b is true and so adds x ← y to the program being written.
It also adds a comment " x previously contained a " at that point in the
program and updates the world model to say that contents(x) = b now holds.
Next, this same program is given the subgoal contents(y) = a and finds
that a no longer exists, so it looks back in the program to find where a
was destroyed. It finds the comment " x previously contained a " and so
patches the program to save a in a temporary variable before it is
destroyed. The program now looks like

    begin
       temp ← x;
       x ← y;  comment x previously contained a ;

Now a exists in temp , so the program can achieve contents(y) = a by

       y ← temp;  comment y previously contained b ;
    end;

The  interesting  issue  here  is  whether  to  look  ahead  when  a  is 
destroyed  and  predict  that  it  will  be  needed  again,  or  to  go  back  and 
patch  if  the  need  is  discovered.  In  this  case  patching  was  much  easier 
than  predicting,  largely  because  a  comment  was  made  in  order  to  facilitate 
any needed patching. (Far better programmers than PUP use many comments
for just that purpose.)
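
The patching strategy can be made concrete with a small simulation. The
following Python sketch is ours and is far simpler than the QLISP program:
the world model is a dictionary, the goal is fixed to the interchange
problem, and the names write_interchange and run are hypothetical. It
attains the conjuncts in order, records a comment whenever a value is
overwritten, and uses that comment to patch in a temporary when the lost
value is needed later.

```python
def write_interchange():
    """Write a program that swaps cells x and y, patching when a value is lost."""
    world = {"x": "a", "y": "b"}      # world model: cell -> symbolic contents
    goals = [("x", "b"), ("y", "a")]  # conjuncts of the goal, attained in order
    program = []                      # (statement, comment) pairs

    for target, value in goals:
        source = next((c for c, v in world.items() if v == value), None)
        if source is None:
            # The value was overwritten. Look back for the comment that
            # recorded its destruction and save it just before that point.
            tag = f" previously contained {value}"
            index = next(i for i, (_, note) in enumerate(program)
                         if note.endswith(tag))
            lost_cell = program[index][1][: -len(tag)]
            program.insert(index, (f"temp := {lost_cell}", ""))
            world["temp"] = value
            source = "temp"
        note = f"{target} previously contained {world[target]}"
        program.append((f"{target} := {source}", note))
        world[target] = value
    return program


def run(program, cells):
    """Execute the written assignments on concrete cell contents."""
    cells = dict(cells)
    for statement, _ in program:
        target, source = statement.split(" := ")
        cells[target] = cells[source]
    return cells
```

Calling write_interchange() yields the three assignments temp := x,
x := y, and y := temp, and run confirms that they exchange the contents of
the two cells. The comment does the work that a lookahead would otherwise
have to do.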

4.4.2  3-element Sort.  This problem, sorting the contents of 3 cells
without using recursion or iteration, is non-trivial even for humans.
Experienced programmers can take several minutes and often come up with
incorrect programs. Formally, the problem is

    initial state          final state

    contents(x) = a        contents(x) ≤ contents(y)
    contents(y) = b        contents(y) ≤ contents(z)
    contents(z) = c        contents of x , y , and z are, in
                           some order, a , b , and c

No information is given about the ordering of a , b , and c . The
third conjunct of the goal is presently handled by a kludge: nothing
PUP knows how to do in achieving the rest of the goal changes this
condition. Thus the goal PUP gets is actually just
contents(x) ≤ contents(y) ∧ contents(y) ≤ contents(z) .

The basic method is to use case analysis, which is adequate (although
a more clever approach is possible). The AND handler begins by decomposing
the main goal into its 2 subgoals. To achieve contents(x) ≤ contents(y)
PUP knows to try 2 things:

   (1)  Is contents(x) ≤ contents(y) already true? PUP can
        prove that it is true if it has been explicitly
        stated or, since PUP knows that ≤ is transitive,
        if there is a simple transitivity chain such that
        contents(x) = α ≤ β ≤ ... ≤ γ = contents(y) . In
        either case, if contents(x) ≤ contents(y) is
        already true, PUP is done.

   (2)  Is contents(y) ≤ contents(x) ? PUP can know this too
        by having it explicitly stated or from a transitivity
        chain. PUP also knows that ¬(α ≤ β) ⊃ β ≤ α , so
        that if it knows ¬(contents(x) ≤ contents(y)) , then
        it can deduce contents(y) ≤ contents(x) . In any
        case, if it decides contents(y) ≤ contents(x) is
        true, PUP interchanges x and y . To do this PUP
        calls itself recursively, giving itself the interchange
        problem discussed above in Section 4.4.1. (Some future
        version of PUP should probably save some information
        about each problem it solves, so that when it is given
        another similar problem it has an easier time. At
        present, however, PUP completely redoes the interchange.)
        After the interchange, PUP interchanges everything it
        knows about x and y that depends on their contents.
        That is, every fact that refers to the contents of x
        is modified to refer to the contents of y and vice
        versa.


Unfortunately, from the initial state none of the relevant ordering
information is known, so the goal of contents(x) ≤ contents(y) fails
to be achieved and the AND handler fails. (A smarter program might have
first noticed that no ordering information was given about a , b ,
and c , and not attempted either of the above steps.)

Failure of the AND handler causes the goal-statement mechanism to
try further programs on the apply list. One of these is a case-analysis
handler. This program picks one of the subgoals, say
contents(x) ≤ contents(y) , and constructs a program of the form

    if x ≤ y then subprogram1 else subprogram2 ;

We note that the implicit assumption here that the ≤ predicate is
computable should be made explicit. A smarter system might recognize
this program as a sort program and go on to produce a nice algorithm.

To find subprogram1 , contents(x) ≤ contents(y) is assumed and
the entire goal retried. Again the AND handler fails. (Although the
first subgoal succeeds since it is assumed, the second subgoal,
contents(y) ≤ contents(z) , fails.) Again we enter the case-analysis
handler. This time since the first subgoal is true (by assumption), it
will not be picked; so the second subgoal is picked. By now, the first
part of the program being constructed looks like

    if x ≤ y then
    begin
       if y ≤ z then

The entire goal is again retried. Since both subgoals are assumed, the
AND handler succeeds this time, and this case is done.

A point to note is that as each subgoal of the AND goal is achieved,
it is added to a list of "protected" facts. After each operation this
list is checked to see that none of the facts on it has been altered.
If any have, an immediate attempt is made to restore them. This can,
of course, lead to infinite loops in which restoring one alters another,
restoring that alters the first, ad infinitum. To prevent this, at
some arbitrary level of restoring within restoring, a cutoff is made
and failure reported. The importance of the process of restoring
protected facts will be shown shortly.

Now we do the else part of the innermost if. To do this the
assumption contents(y) ≤ contents(z) is removed, and the assumption
¬(contents(y) ≤ contents(z)) is made. Then the whole goal is retried.
The first subgoal, still assumed, succeeds and is added to the protected
list. The second subgoal is tried, and since contents(z) ≤ contents(y)
now holds, y and z are interchanged. A side effect of this
interchange is to modify the fact contents(x) ≤ contents(y) to be
contents(x) ≤ contents(z) .

After the interchange the protection list is checked, and because of
the interchange PUP no longer has the fact contents(x) ≤ contents(y) ,
so an attempt is made to restore that condition. As before, direct methods
fail, and the case-analysis handler is invoked. As before, a conditional
statement is added to the program, and the true and false branches are
written by assuming the truth and falsehood, respectively, of the
condition. The true case results in the null program, and the false
case results in an interchange. The attempt to restore
contents(x) ≤ contents(y) succeeds, so the else part of the innermost
if succeeds and thus the whole innermost if does too. The program now
looks like this (without comments):

    if x ≤ y then
    begin
       if y ≤ z then else
       begin
          temp1 ← y;
          y ← z;
          z ← temp1;
          if x ≤ y then else
          begin
             temp2 ← x;
             x ← y;
             y ← temp2
          end
       end
    end
    else
       subprogram2 ;

Finally subprogram2 is written. All assumptions and deductions
specific to the process of writing subprogram1 are removed, and
¬(contents(x) ≤ contents(y)) is assumed. An interchange is needed to
establish the first subgoal, but otherwise the process is similar to that
of writing subprogram1 . The final program is

    if x ≤ y then
    begin
       if y ≤ z then else
       begin
          temp1 ← y;
          y ← z;
          z ← temp1;
          if x ≤ y then else
          begin
             temp2 ← x;
             x ← y;
             y ← temp2
          end
       end
    end
    else
    begin
       temp3 ← x;
       x ← y;
       y ← temp3;
       if y ≤ z then else
       begin
          temp4 ← y;
          y ← z;
          z ← temp4;
          if x ≤ y then else
          begin
             temp5 ← x;
             x ← y;
             y ← temp5
          end
       end
    end;
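
Since experienced programmers often get this problem wrong, it is worth
checking the result mechanically. The Python sketch below is our
transliteration of the final program, with the comparison read as "less
than or equal"; the names sort3 and sorts_every_case are ours. It runs the
program on every assignment of values drawn from 0, 1, 2 to the 3 cells,
which covers every ordering, including ties.

```python
from itertools import product


def sort3(x, y, z):
    """Transliteration of the straight-line 3-element sort."""
    if x <= y:
        if y <= z:
            pass
        else:
            temp1 = y; y = z; z = temp1
            if x <= y:
                pass
            else:
                temp2 = x; x = y; y = temp2
    else:
        temp3 = x; x = y; y = temp3
        if y <= z:
            pass
        else:
            temp4 = y; y = z; z = temp4
            if x <= y:
                pass
            else:
                temp5 = x; x = y; y = temp5
    return x, y, z


def sorts_every_case():
    """Check the program on all 27 triples over {0, 1, 2}."""
    return all(list(sort3(*t)) == sorted(t)
               for t in product(range(3), repeat=3))
```

The empty then branches correspond to the null programs written when a
protected fact already holds.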


4.4.3  Integer Square Root.  In this example the desired program
should find ⌊√x⌋ , the floor of the square root of input x . This
task was chosen to coincide with Manna's tutorial on automatic
programming [21], which compared the abilities of existing systems to
synthesize or verify such a program. PUP's performance was gained by
sacrificing formal methods, and with them the associated formal guarantees.

PUP has just the right knowledge about numeric functions, number
systems, ordering, maxima and minima, searching, and the real square-root
function to make the problem interesting yet doable. For example, PUP
does not know any program which directly computes the square root of x .
However, it does know how to test if an input is equal to the square root
of x , by comparing the square of the input to x . And PUP does have a
program to compute the square of a number: multiply it by itself.

Let us investigate the dialog now. The user asks for the integer
square root of some number, say isqrt(82) . Since PUP doesn't recognize
the function isqrt , it assumes the user either made a typographical
error or wants PUP to write a new function. The user settles that
question in favor of the latter alternative, and PUP notices that there
is 1 numeric argument. The knowledge of numeric functions is sufficient
to realize that the domain and range of the function should be pinpointed
if possible. The user indicates that both domain and range are the
natural numbers. PUP now picks names for the input and output variables,
say x and y , respectively, and asks the user to describe the function
in terms of these variables. The user replies with

    isqrt(x) ← max y such that y ≤ square_root(x);

PUP first considers whether or not the condition y ≤ square_root(x)
is directly testable given x and y , i.e., whether PUP already has a
program which can do it. Knowledge of the ≤ relation says that the
test can be done if and only if each side is computable. We trivially
have the left side, given x and y . But PUP doesn't have an algorithm
to compute square_root(x) , so we must look deeper for the right side.


Knowledge of inequalities says to fix this up by finding an inverse
function of square_root , say i , and by replacing the old inequality
by i(y) ≤ x . A warning note says that such an inverse must be computable
(and in addition both the inverse and the original function must be
monotone); otherwise, we're no better off than before. The main fact
about square_root is that its inverse is achieved by squaring. Both
the square_root and square functions have tags indicating monotonicity.
Also, square is known to be computable, so the problem statement is now
reformulated as

    isqrt(x) ← max y such that square(y) ≤ x;

The second problem is whether an algorithm is already known which
computes the maximum element in the range of a given predicate. Knowledge
about max includes only 1 algorithm: start by choosing the upper bound
of the range and then iterate, decrementing the candidate each time, until
the predicate is satisfied. Knowledge of the natural numbers says that an
upper bound does not exist, so this straightforward method won't work.
Fortunately, max knows a transformation of itself when the predicate
is monotone and the range is a segment of the integers:
max y such that p(y) becomes min y such that ¬p(y + 1) . Both
the conditions are verified in our case, so the change is tentatively
made, and the problem statement becomes

    isqrt(x) ← min y such that ¬(square(y + 1) ≤ x);

(Notice that PUP implicitly assumes that the negation of a computable
predicate is computable. This should probably be made explicit.) Knowledge
of negation allows the replacement of ¬≤ by > at this point, and
we get

    isqrt(x) ← min y such that square(y + 1) > x;

Now algorithms for computing min are examined. The only one says
to start at the lower bound of the range and repeatedly increment until
the predicate is satisfied. Knowledge of natural numbers informs us that
a lower bound is 0 . PUP converts this to the final code:

    isqrt(x) ← isqrt1(0, x);

    isqrt1(y, x) ← if square(y + 1) > x then y else isqrt1(y + 1, x)

PUP enters the program in its records, recalls the original request
for isqrt(82) , and runs the new program on it.
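
The chain of reformulations can be checked by running both ends of it.
The Python sketch below is our own rendering: isqrt and isqrt1 follow the
final code above, while isqrt_by_max is a hypothetical direct reading of
the intermediate statement "max y such that square(y) ≤ x". To make that
reading executable we bound the search by x, which is safe because
square(y) ≤ x implies y ≤ x on the natural numbers.

```python
def square(y):
    """Square a number by multiplying it by itself."""
    return y * y


def isqrt1(y, x):
    """min y such that square(y + 1) > x, searching upward from y."""
    return y if square(y + 1) > x else isqrt1(y + 1, x)


def isqrt(x):
    """Final code: start the upward search at the lower bound 0."""
    return isqrt1(0, x)


def isqrt_by_max(x):
    """Intermediate form: max y such that square(y) <= x (bounded by x)."""
    return max(y for y in range(x + 1) if square(y) <= x)


# The max-to-min transformation should not change the function computed.
assert all(isqrt(x) == isqrt_by_max(x) for x in range(200))
assert isqrt(82) == 9
```

For the original request, isqrt(82) gives 9, since square(9) = 81 and
square(10) = 100. The recursion depth grows with the answer, so this form
is suitable only for small inputs.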

Notice the flavor of PUP's operation: locating relevant information,
which either provides some of the final code or points to more information
which is needed. It is the structuring of this knowledge which beats the
combinatorial explosion of searching for relevant facts.

4.5  Examples  Program 

This program, called EXAMPLE, infers recursive LISP functions from
single example input-output pairs. The program was written in INTERLISP
by Shaw and later revised by William Swartout. The inductive inference
of functions from example I/O pairs has also been explored by
J. C. R. Licklider [1] and Hardy [14].

As a typical problem solved by EXAMPLE, given the example I/O pair

    input            output

    (A B C D)   →    (D D C C B B A A)

it synthesizes the "reverse and double" function

    f(x) ← if null(x) then nil else
              append(f(cdr(x)), list(car(x), car(x)));
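
For readers who prefer a runnable form, the synthesized function can be
transcribed directly. This Python version is our transliteration, with
Python lists standing in for LISP lists; the name reverse_and_double is
ours.

```python
def reverse_and_double(x):
    """Recur down the list; append the doubled head after the result for the tail."""
    if not x:                                          # null(x)
        return []                                      # nil
    return reverse_and_double(x[1:]) + [x[0], x[0]]    # append(f(cdr x), list(car x, car x))


# The single example pair from which the function was inferred.
assert reverse_and_double(["A", "B", "C", "D"]) == list("DDCCBBAA")
```

Each step of the recursion contributes one piece of the output, here the
pair of copies of the current head, and the reversal comes from appending
that piece after the recursive result rather than before it.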


EXAMPLE  can  infer  a  class  of  functions  which  can  be  approximately 
characterized  as  simple  list-to-list  transformations.  A  somewhat  more 
precise  characterization  of  the  class  is  that  each  function  recurs  along 
an  input  list  (or  lists)  and  produces  some  part  of  the  output  (possibly 
empty)  for  each  step  of  the  recursion.  These  pieces  of  the  output  are 
assembled  into  the  output  list  without  any  reordering  (with  the  possible 
exception  of  completely  reversing  the  output) .  At  each  step  of  the 
recursion,  a  similar  recursive  subfunction  can  be  used  to  produce  that 
step's  portion  of  the  output.  There  can  be  several  input  arguments,  and 
the  function  written  can  be  recursive  in  any  number  of  arguments. 

As an example, consider the I/O pair

    (A B C D)  →  ((A B)(A C)(A D)  (B C)(B D)  (C D))
                          1             2          3

The output is produced in 3 steps as indicated. A recursive subfunction
produces the sublists (1, 2, and 3 shown above) in successive steps, and
the main function appends them together. EXAMPLE can synthesize this
function and variations, such as having the output reversed or the same
output but with each sublist reversed.

The program works as follows. Consider the synthesis of the function
discussed above. Call it f . First EXAMPLE decides how much of the
output is produced in the first step of the recursion (referred to as the
recursive head). Thus, in the example above, it decides that the first
sublist (A B)(A C)(A D) is produced in the first step and is the recursive
head. (The heuristic by which it decides this is interesting and is
discussed later.) Next it sets up the subproblem of synthesizing the
code that produces the head. This can be thought of as specifying a
subfunction, although in-line code may be used if no recursion is necessary.
In our example a recursive subfunction, call it f1 , is required. First
the arguments of f1 are selected. In this case EXAMPLE chooses 2
arguments for f1 , car of the input, A , and cdr of the input,
(B C D) . Obviously f1 just lists car of the input with each of the
elements of the cdr . After the inputs are set up, the subfunction is
written in the same manner as the main function, by a recursive call to
EXAMPLE. Returning to the synthesis of the main function, there are 3
remaining steps: (1) the terminating conditions are selected;
(2) the results from each recursive step are joined properly, using
either cons or append ; and (3) the recursive call of the main
function is formed. The recursive call can be on the cdr , cddr ,
cdddr , etc. For example, in (A B C D E F) → (A C E) the recursive
call is on the cddr of the input.

The program written for (A B C D) → ((A B)(A C)(A D)(B C)(B D)(C D))
is

    f(x) ← if null(x) then nil else
           if null(cdr(x)) then nil else
              append(f1(car(x), cdr(x)), f(cdr(x)));

    f1(y, z) ← if null(z) then nil else
                  cons(list(y, car(z)), f1(y, cdr(z)));
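
The division of labor between the main function and its subfunction is
easy to see when the pair is run. The following Python transcription is
ours, again with Python lists standing in for LISP lists.

```python
def f1(y, z):
    """Subfunction: pair y with each element of z, in order."""
    if not z:                                  # null(z)
        return []
    return [[y, z[0]]] + f1(y, z[1:])          # cons(list(y, car z), f1(y, cdr z))


def f(x):
    """Main function: produce the head with f1, then recur on the cdr."""
    if not x:                                  # null(x)
        return []
    if not x[1:]:                              # null(cdr(x))
        return []
    return f1(x[0], x[1:]) + f(x[1:])          # append(f1(car x, cdr x), f(cdr x))
```

On the input (A B C D) the three calls to f1 return the three sublists
of the example, and the appends in f join them without reordering.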


EXAMPLE is fairly complex, but we will describe one interesting part,
namely the heuristic that decides where to break the output list into the
recursive head and the rest. The output list is scanned left to right (and
possibly right to left if necessary), looking for a simple progression.
When a large change is encountered, this point is proposed as the break.
In our example, (A B C D) → ((A B)(A C)(A D)(B C)(B D)(C D)) , the
pattern (A next_input) , where next_input signifies the successive
elements in the input past A (i.e., B , C , and D ), is discovered
to match the first 3 elements of the output but not (B C) , so the break
occurs before (B C) . This heuristic, along with many others, such as
determining when to write a subfunction and the number of arguments for
a subfunction, works fairly well.
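
To give the flavor of the break heuristic, here is a deliberately narrow
Python sketch. It is our own simplification, not EXAMPLE's code: it tries
only the single pattern (first_input next_input) and scans only left to
right, whereas the program searches for a simple progression more
generally. The name recursive_head_length is hypothetical.

```python
def recursive_head_length(inputs, output):
    """Count the leading output elements that match (first_input next_input)."""
    first, rest = inputs[0], inputs[1:]
    count = 0
    for i, element in enumerate(output):
        # A mismatch, or running out of input elements, is the "large change".
        if i >= len(rest) or element != [first, rest[i]]:
            break
        count += 1
    return count


inputs = list("ABCD")
output = [["A", "B"], ["A", "C"], ["A", "D"], ["B", "C"], ["B", "D"], ["C", "D"]]
head = output[:recursive_head_length(inputs, output)]
```

Here head is the sublist (A B)(A C)(A D), and the first element after it
is (B C), which is where the break is proposed.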

The following examples are ones for which a reasonable program was
automatically generated. Some 1-input examples are

    input                output

    (A B C D)            (D C B A)
    (A B C)              (A A B B C C)
    (A B C D)            (D D C C B B A A)
    (A B C D E F)        (A C E)
    (A B C D E F)        (E C A)
    (A B C D E F)        (B D F)
    (A B C D)            ((A) (B) (C) (D))
    (A B C D)            ((A B)(A C)(A D)(B C)(B D)(C D))
    (A B C D)            (A B C D B C D C D D)
    (A B C D)            (D C B A D C B D C D)
    (A B C D E F)        (B A D C F E)

Some 2-input examples are

    input 1      input 2           output

    FN           (A B C D)    →    ((FN A)(FN B)(FN C)(FN D))
    (A B C)      (D E F)      →    (A D B E C F)
    (A B C)      (D E)        →    (A D B D C D A E B E C E)
    (A B C)      (D E)        →    ((A D)(A E)(B D)(B E)(C D)(C E))
    (A B C)      (D E F)      →    ((A D)(B E)(C F))


The  limitations  of  the  system  are 


(1)  Only the position of an element, and not its identity,
is considered in deciding what to do with it. Thus a
reverse program can be written, but a sort cannot.

(2)  On the input, only top-level list recursions, as opposed
to tree recursions, are attempted. Thus the flatten
function [e.g., (A B (C (D E) F) G) → (A B C D E F G)]
is not possible.

(3)  The  organization  of  the  program  makes  extension  into  new 
areas  reasonably  difficult.  We  plan  to  reorganize  the 
program  and  to  add  cleverer,  domain-specific  facts  to 
increase  its  power. 


4.6  Synthesis of Large Inductive-inference Programs

Our next system, PUP5 by Lenat, represents an attempt at the synthesis
of larger, more domain-specific programs. The system was designed to
write concept-formation programs, a class of programs which inductively
infer the definition of a concept from a number of instances of that
concept [18]. The original target program to be synthesized
semi-automatically was SPOT, a small version of Winston's concept-formation
program [34] without its fancy graph-matching algorithm, written by
Peter Gadwa at Stanford University. SPOT was specifically designed to
be a simple (5-page), yet still interesting program. During the course
of the design of PUP5, the target program evolved into a somewhat
different program.

PUP5 is still only an experimental vehicle, but it has proved
moderately successful. It has indeed written a concept-formation program
similar to the intended one, although augmented by self-documentation.

PUP5 is being revised to write a wider class of inductive-inference
programs. The next target program is a simple grammatical-inference
program, upon which work should be completed shortly.

Although the system is written entirely in INTERLISP, many popular
AI-language features [5] (e.g., pattern matching, assertions, goal
direction, apply teams, backtracking, special data types, demons, etc.)
were hand coded expressly for this system. The entire 100 pages of code
is organized as an interacting community of small units, called beings.
Although complex, the structure of each being is the same: a set of answers
to about 30 fixed questions. These questions, called the being parts,
represent "everything you always wanted to know about a small program".
Neither the exact set chosen nor the number 30 is very important; the
approximate size of the set is relevant to automatic programming, however.
Each being part is itself a little program which knows what the 30 questions
are and which may ask any being any question it wants to. Since some
beings must write target code, we choose to have each being x write all
code similar to x . For example, the sort being contains a costly
"big switch" hooked to various sorting algorithms, but the code it writes
in any specific instance will be a tailor-written implementation of a
particular sort algorithm.
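
The uniform structure of a being can be suggested by a toy model. The
Python sketch below is only our illustration of the idea and bears no
relation to the actual INTERLISP code. The class Being, the two code
templates, and the part name write_code are hypothetical; what and how
are used as question names in the sense of the being parts described here.

```python
class Being:
    """A small unit of knowledge: a name plus answers (parts) to fixed questions."""

    def __init__(self, name, **parts):
        self.name = name
        self.parts = parts

    def ask(self, question, *args):
        """Answer one question; a part may be stored data or a little program."""
        part = self.parts.get(question)
        return part(*args) if callable(part) else part


# Hypothetical target-code templates standing in for the "big switch".
INSERTION_SORT = '''\
def sort(xs):
    out = []
    for x in xs:
        i = 0
        while i < len(out) and out[i] <= x:
            i += 1
        out.insert(i, x)
    return out
'''

SELECTION_SORT = '''\
def sort(xs):
    rest, out = list(xs), []
    while rest:
        smallest = min(rest)
        rest.remove(smallest)
        out.append(smallest)
    return out
'''


def write_sort_code(algorithm):
    """The switch lives in the being; the code it writes is one specific sort."""
    return {"insertion": INSERTION_SORT, "selection": SELECTION_SORT}[algorithm]


sort_being = Being(
    "sort",
    what="puts the elements of a list into order",
    how="chooses one of several sorting algorithms and writes code for it",
    write_code=write_sort_code,
)

# The community: any being (or being part) can put any question to any being.
community = {sort_being.name: sort_being}
```

Asking community["sort"] the question write_code with the argument
"selection" returns only the selection-sort text, so the written program
carries none of the cost of the switch that chose it.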

Although PUP5 insists on doing structured programming (hence uses
something like macro expansion), its control structure employs feed forward,
feedback, backtracking, and a contextual assertion base. One bit of
inherent philosophy is that the system should defer making all decisions
as long as possible. We hope that by this deferral, along with careful
record keeping, we can eliminate most of the carelessness "bugs" that
typically arise in humans as a result of brain-hardware limitations. This
is in contrast to earlier versions of PUP [see Section 4.4], which viewed
debugging as the predominant part of programming. Thus, PUP5 rarely
believes it is finished if in fact it has overlooked some details.

We  now  present  (most  of)  the  current  parts  of  a  being: 


name                    description

identity                how the being is referenced in English
                        sentences
arguments               which arguments are required and which
                        are optional
argument_check          predicate which examines each argument
                        for suitability
evaluate_arguments      which arguments of the being and in the
                        code generated by the being should
                        be evaluated
what                    brief summary of what the being does
why                     justification for the being's existence:
                        why it is called
how                     summary of the method(s) used by the
                        being to do its thing
effects                 postconditions which will be true after
                        calling the being
when                    factors and weights telling how apropos
                        the being is right now
meta_code               body of the code, but with uninstantiated
                        subparts
comments                aid to filling in the meta_code
requisites              what must be actively satisfied just
                        before (prerequisites), during
                        (corequisites), and just after
                        (postrequisites) the being is
                        executed
demons                  which demons should be enabled during
                        the being's execution
affects                 which other beings might be called by
                        this being
complexity              vector describing such features as
                        recursiveness, overall cost,
                        chance of failing, transparency
                        to user, etc.
specializations         what must be known to write a streamlined
                        version of this being
alternatives            equivalent beings in case this one
                        doesn't work
generalizations         more general beings in case none of the
                        alternative beings works
predicate               what type of values the being returns
data structure          if being is a data structure, how it
                        is initialized and accessed, how
                        elements are inserted and deleted
encodable               description of the flow of control in
                        writing a specialized new being
inhibit_current demons  enable/inhibit mechanism for demons
form changing           where in the being tree this being can
                        directly return to
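
One way to picture a being is as a record of named parts, each answering one kind of question, with the "when" part supplying weighted factors used to decide which being is most apropos. The sketch below is an illustration under stated assumptions, not the RJP5 representation: the class Being, the functions relevance and most_apropos, and the example factors and weights are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Being:
    """A being modeled as named parts, each answering one kind of question."""
    name: str
    parts: dict = field(default_factory=dict)

    def ask(self, part):
        # A part the being does not define yields None rather than an error.
        return self.parts.get(part)


def relevance(being, context):
    """Sum the weights of this being's `when` factors that hold in context."""
    factors = being.ask("when") or {}
    return sum(weight for factor, weight in factors.items() if factor in context)


def most_apropos(beings, context):
    """Pick the being whose `when` part scores highest right now."""
    return max(beings, key=lambda b: relevance(b, context))


# Hypothetical beings; the factors and weights are invented for illustration.
sort_being = Being("sort", {
    "what": "puts the elements of a collection in order",
    "when": {"unordered_collection": 4, "order_matters": 3},
})
reverse_being = Being("reverse", {
    "what": "reverses the order of a list",
    "when": {"order_matters": 1, "wants_reversal": 5},
})

context = {"unordered_collection", "order_matters"}
chosen = most_apropos([sort_being, reverse_being], context)
```

Keeping the parts in a dictionary reflects the remark that only a few facts from any given being are consulted in a dialog: a part that is never asked about costs nothing.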


Although each being has about 50 answers, each of which might contain
several facts, only about 10 facts from any given being are actually
employed during the course of the program-writing dialog. A typical
programming being is obtain_usable_information. Its "when" part
says that calling this being is generally undesirable, but may be the
only reasonable course to follow if there exists new information which is
not directly usable. Its "how" part says to choose (creating a
non-deterministic backtrack point) from among these: translate, get
totally new raw information, extract a small subset of existing raw
information to concentrate upon, or analyze the implications of a small
set of existing raw information. A typical domain-specific being is
partition_a_domain. Its "specializations" part says to find out
whether the partition is partial or total, whether it is weak or strong,
and whether it is built by repeatedly accepting (element, class name)
pairs and/or accepting an element (then guessing and verifying its
class name) and/or accepting a class name (then guessing and verifying
its element(s)).
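
The non-deterministic backtrack point described for obtain_usable_information can be sketched as a generator over the four alternatives: a failure simply resumes the generator at the next alternative. This is a minimal sketch, not the RJP5 control structure. The identifiers for the alternatives, the try_method callback, and the convention that None signals failure are all assumptions.

```python
# Hypothetical names for the four alternatives listed in the text.
HOW_ALTERNATIVES = [
    "translate",
    "get_new_raw_information",
    "extract_subset",
    "analyze_implications",
]


def choose(options):
    """A choice point: yield each option in turn. Resuming the generator
    after a failure is what backtracking to this point amounts to."""
    for option in options:
        yield option


def obtain_usable_information(raw, try_method):
    """Try each alternative until one produces a usable result.

    `try_method(method, raw)` is an assumed callback that returns the usable
    information, or None to signal failure.
    """
    attempted = []
    for method in choose(HOW_ALTERNATIVES):
        attempted.append(method)
        result = try_method(method, raw)
        if result is not None:
            return method, result, attempted
    # Every alternative failed; the caller must backtrack further up.
    return None, None, attempted


def demo_try(method, raw):
    # Invented behavior: only extracting a subset works on this input.
    if method == "extract_subset":
        return raw[:2]
    return None


method, info, attempted = obtain_usable_information(["a", "b", "c"], demo_try)
```

Recording the attempted alternatives is in the spirit of the careful record keeping mentioned earlier: when a later step fails, the system can see which options at this choice point remain untried.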

The dialog involved in an RJP5 run is carried on in a minuscule
subset of English. Since it encompasses precisely the sentences which
the user wants to say, the dialog gives the illusion of being unconstrained.
However, the term "the user" is not generic, as there has only been 1 user
so far. The interaction system works by each being recognizing and
processing phrases referring to it. The dialog for synthesizing the
concept-formation program takes several hours of console time. Much
of the interaction is unnecessary: RJP5 asks the user to name things
which are never referenced again. This annoyance is being worked on.

A promising sign of programming-knowledge convergence is that out of
67 programming beings 50 are used by RJP5 during the course of writing
both of the target programs (concept formation and grammatical inference).
Future plans for RJP5 work include studying the various types of knowledge
needed for programming, inductive inference, and specific target programs.
This will (hopefully) be done by extending RJP5 to handle more and bigger
tasks.

4.7  Sorting

During the past year, Green and Barstow have attempted to isolate
and codify those "facts" of programming knowledge which are necessary for
a system which can understand and write simple iterative sorting programs.
To keep the working domain small, such techniques as recursion and exchange
sorting (e.g., bubble sort) and such fast algorithms as quicksort [10]
and heapsort [8, 33] were explicitly excluded from consideration. In the
course of this attempt, it became apparent that many concepts were
involved and needed to be analyzed. The present set of facts is a list
of 100 rules which deal with sorting and permutations, generators for
explicitly given sets, set constructors, and several types of generate-
and-test methods. The rules allow for either array or list representations
of sets. There are at present no rules regarding efficiency considerations
or formal verification of correctness. This we consider a shortcoming,
and Elaine Kant has recently begun studying the addition of rules for
optimization.

One  interesting  aspect  of  our  list  of  rules  is  that  it  covers  a  wide 
range  of  levels.  As  an  example  of  the  range  covered,  there  are  rules 
dealing  with  the  choice  between  selection  and  insertion  sorts,  with 
state-saving  schemata  for  generators,  with  the  choice  of  variable  names, 
and  with  the  addition  of  elements  to  the  front  of  a  list.  One  initial 
goal  of  our  work  was  to  have  each  rule  be  relatively  simple  and  explicit; 
we  feel  that  we  have  been  moderately  successful  in  this  regard.  Thus, 
these  rules  provide  a  knowledge  base  for  a  program-writing  system,  and  it 
is  the  interaction  of  these  rules  which  provides  the  foundation  for  the 
system's "understanding" of sort programs.

The rules have been organized in a goal/subgoal fashion, with the
capabilities of disjunctive and sequential subgoals and subgoaling by
cases. A preliminary implementation of a system based upon these rules
has been completed. Each rule has been written as an INTERLISP function.
The control system consists of several other functions which describe
the efforts of the system as it writes a program, ask for choices at
OR-rule junctures, and provide limited additional explanatory information
on request (e.g., a why function to explain the purpose of a section of
the final program). The traces tend to be overly verbose, but confirm
our belief that the rules can form the basis of an understanding system.
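
The goal/subgoal organization can be illustrated with a small interpreter that expands sequential subgoals in order, asks a chooser at each OR-rule juncture, and records a trace of its efforts. This is a sketch under stated assumptions rather than the INTERLISP implementation: the Rule record, the expand function, the handful of rules, and the code fragments attached to them are invented for illustration and are far coarser than the rules described above.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    goal: str
    kind: str = "leaf"   # "leaf", "seq" (sequential), or "or" (disjunctive)
    subgoals: list = field(default_factory=list)
    code: str = ""       # fragment contributed by a leaf rule


def expand(goal, rules, choose, trace, depth=0):
    """Expand a goal into code fragments, logging each step in `trace`."""
    rule = rules[goal]
    trace.append("  " * depth + f"{rule.kind}: {goal}")
    if rule.kind == "leaf":
        return [rule.code]
    if rule.kind == "or":
        # OR-rule juncture: someone (e.g. the user) picks one subgoal.
        picked = choose(goal, rule.subgoals)
        return expand(picked, rules, choose, trace, depth + 1)
    fragments = []
    for sub in rule.subgoals:  # sequential subgoals, in order
        fragments.extend(expand(sub, rules, choose, trace, depth + 1))
    return fragments


# Hypothetical rules; the fragments are placeholders, not synthesized code.
RULES = {r.goal: r for r in [
    Rule("sort", "or", ["selection_sort", "insertion_sort"]),
    Rule("selection_sort", "seq", ["select_smallest", "add_to_end"]),
    Rule("insertion_sort", "seq", ["take_next", "insert_in_order"]),
    Rule("select_smallest", code="x = min(source)"),
    Rule("add_to_end", code="result.append(x)"),
    Rule("take_next", code="x = source[0]"),
    Rule("insert_in_order", code="bisect.insort(result, x)"),
]}

trace = []
fragments = expand("sort", RULES, lambda goal, options: options[1], trace)
```

Because every fragment is produced under a chain of named goals, the trace already holds the information a why function would need: the purpose of a fragment is the list of goals above it.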

It  should  be  emphasized  that  this  system  was  primarily  a  "quick  and 
dirty"  effort,  intended  as  a  device  for  testing  and  refining  rules,  rather 
than  as  a  program-writing  system.  One  test  of  the  rules  is,  of  course, 
adequacy,  and  the  system  has  successfully  written  3  substantially  different 
programs:  a  reverse  program,  a  selection  sort,  and  an  insertion  sort. 
Although  not  all  of  the  variations  have  been  completed  to  date,  we  expect 
that  with  perhaps  20  additional  rules  our  system  should  be  capable  of 
generating  a  few  dozen  distinct  (although  in  many  cases  similar)  programs. 
The  programs  produced  are  generally  about  1  page  in  length  (using  the 
INTERLISP prettyprint function as a standard of measurement).

We  feel  that  this  line  of  research  has  been  fruitful  and  plan  to 
continue  it  in  the  future.  It  is  our  expectation  that  such  a  structuring 
of  knowledge  will  make  possible  the  incremental  addition  of  rules  for 
other  aspects  of  low-level  programs  and  that  any  additional  rules  will 
use  many  of  the  present  rules  as  subgoals. 